Generated: /root/verbalize-sampling
Experiments: 4
Metrics: diversity, ngram, joke_quality
| Name | Task | Method | Model | Responses | Temperature |
|---|---|---|---|---|---|
| direct (samples=1) | joke | direct | meta-llama/llama-3.1-70b-instruct | 30 | 0.7 |
| sequence [strict] (samples=5) | joke | sequence | meta-llama/llama-3.1-70b-instruct | 30 | 0.7 |
| multi_turn [strict] (samples=5) | joke | multi_turn | meta-llama/llama-3.1-70b-instruct | 30 | 0.7 |
| structure_with_prob [strict] (samples=5) | joke | structure_with_prob | meta-llama/llama-3.1-70b-instruct | 30 | 0.7 |

Tell me a joke.
A man walked into a library and asked the librarian, "Do you have any books on Pavlov's dogs and Schrödinger's cat?" The librarian replied, "It rings a bell, but I'm not sure if it's here or not." The man laughed and said, "Well, that's a real paws-ibility!" The librarian just rolled her eyes and muttered, "Cat-astrophic puns are the worst." The man grinned and said, "Hey, I'm just trying to make a purr-fect joke!" The librarian sighed and handed him a book, saying, "Here, take this and paws for a moment. Maybe it'll help you come up with a joke that's the cat's meow."
Tell me a joke.
A man walked into a library and asked the librarian, "Do you have any books on Pavlov's dogs and Schrödinger's cat?" The librarian replied, "It rings a bell, but I'm not sure if it's here or not."
Tell me a joke.
A man walked into a library and asked the librarian, "Do you have any books on Pavlov's dogs and Schrödinger's cat?" The librarian replied, "It rings a bell, but I'm not sure if it's here or not."
Tell me a joke.
A man walked into a library and asked the librarian, 'Do you have any books on Pavlov's dogs and Schrödinger's cat?' The librarian replied, 'It rings a bell, but I'm not sure if it's here or not.'
Tell me a joke.
Why couldn't the bicycle stand up by itself? Because it was two-tired!
Tell me a joke.
A man walked into a library and asked the librarian, 'Do you have any books on Pavlov's dogs and Schrödinger's cat?' The librarian replied, 'It rings a bell, but I'm not sure if it's here or not.'
Make a joke about coffee
Why did the coffee file a police report? Because it got mugged!
Make a joke about coffee
Why did the coffee go to therapy? Because it was feeling a little bitter!
Make a joke about coffee
Why did the coffee go to the doctor? Because it was feeling a little jittery!
Tell me a joke.
{'text': 'Why don't scientists trust atoms? Because they make up everything.', 'probability': 0.4}
Tell me a joke.
:{
Tell me a joke.
{'text': 'A man walked into a library and asked the librarian, "Do you have any books on Pavlov's dogs and Schrödinger's cat?" The librarian replied, "It rings a bell, but I'm not sure if it's here or not."', 'probability': 0.3}

| Experiment | Average Diversity | Min Diversity | Max Diversity | Std Diversity | Average Response Length | Average Unique Words | Average Vocabulary Richness | Total Cost |
|---|---|---|---|---|---|---|---|---|
| direct (samples=1) | 0.0711 | 0.0000 | 0.4483 | 0.1011 | 89.0400 | 51.1400 | 0.8393 | 0.0003 |
| sequence [strict] (samples=5) | 0.2733 | 0.0000 | 0.4850 | 0.1310 | 18.2115 | 16.5513 | 0.9580 | 0.0001 |
| multi_turn [strict] (samples=5) | 0.2234 | 0.0000 | 0.4899 | 0.0996 | 70.6933 | 52.1400 | 0.7908 | 0.0003 |
| structure_with_prob [strict] (samples=5) | 0.2271 | 0.0000 | 0.4409 | 0.1310 | 15.0351 | 13.7105 | 0.9724 | 0.0001 |

| Experiment | Average Rouge L | Min Rouge L | Max Rouge L | Std Rouge L | Average Response Length |
|---|---|---|---|---|---|
| direct (samples=1) | 0.6167 | 0.0000 | 1.0000 | 0.3683 | 89.0400 |
| sequence [strict] (samples=5) | 0.2997 | 0.0000 | 1.0000 | 0.3033 | 18.2115 |
| multi_turn [strict] (samples=5) | 0.2515 | 0.0000 | 1.0000 | 0.2109 | 70.6933 |
| structure_with_prob [strict] (samples=5) | 0.2339 | 0.0000 | 1.0000 | 0.2672 | 15.0351 |

| Experiment | Funniness | Cleverness | Originality | Structure | Overall | Normalized Overall |
|---|---|---|---|---|---|---|
| direct (samples=1) | 3.5333 | 4.1467 | 3.2467 | 3.9067 | 3.6853 | 0.7371 |
| sequence [strict] (samples=5) | 3.3077 | 3.5321 | 2.4231 | 3.9038 | 3.2763 | 0.6553 |
| multi_turn [strict] (samples=5) | 3.2600 | 3.8733 | 2.8867 | 3.8000 | 3.4197 | 0.6839 |
| structure_with_prob [strict] (samples=5) | 2.4474 | 2.5614 | 1.8070 | 3.0351 | 2.4360 | 0.4872 |
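
The report does not say how Normalized Overall is derived. In all four rows above it matches Overall divided by 5 to within rounding, which would fit a 5-point judge scale. The sketch below checks that reading against the reported values. The scale maximum of 5 and the helper names `normalize` and `max_abs_error` are assumptions for illustration, not part of the report or its tooling.

```python
# Rows copied from the joke_quality table above:
# (experiment, Overall, Normalized Overall).
ROWS = [
    ("direct (samples=1)", 3.6853, 0.7371),
    ("sequence [strict] (samples=5)", 3.2763, 0.6553),
    ("multi_turn [strict] (samples=5)", 3.4197, 0.6839),
    ("structure_with_prob [strict] (samples=5)", 2.4360, 0.4872),
]

# Assumption: judge scores have a maximum of 5. The report does not state this.
ASSUMED_MAX_SCORE = 5.0


def normalize(overall, max_score=ASSUMED_MAX_SCORE):
    """Scale an Overall score into [0, 1] by dividing by the assumed maximum."""
    return overall / max_score


def max_abs_error(rows):
    """Largest gap between the recomputed and the reported normalized score."""
    return max(abs(normalize(overall) - reported) for _, overall, reported in rows)


if __name__ == "__main__":
    for name, overall, reported in ROWS:
        print(f"{name}: recomputed {normalize(overall):.4f}, reported {reported:.4f}")
    print(f"max abs error: {max_abs_error(ROWS):.5f}")
```

If the judge scale were different, the recomputed values would drift from the reported column by more than rounding error, so the check doubles as a quick test of the scale assumption.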